Improving Example Based Machine Translation Through Morphological Generalization and Adaptation

Authors

  • Aaron B. Phillips
  • Violetta Cavalli-Sforza
  • Ralf D. Brown
Abstract

Example Based Machine Translation (EBMT) is limited by the quantity and scope of its training data. Even with a reasonably large corpus, we will not have examples that cover everything we want to translate. This problem is especially severe in Arabic due to its rich morphology. We demonstrate a novel method that exploits the regular nature of Arabic morphology to increase the quality and coverage of machine translation. Through the use of generalization and rewrite rules, we are able to recover the English translation of phrases that do not exist in the training corpora. Furthermore, this system shows improvement in BLEU even with a training corpus of 1.4 million sentence pairs.
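The idea of recovering a translation for an unseen phrase through generalization and rewrite rules can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the prefix table, the tiny example lexicon (in Buckwalter-style transliteration), and the function name `translate` are hypothetical and are not taken from the paper, whose actual rules and matching procedure are not given in the abstract.

```python
# Toy illustration only: the prefix table, lexicon, and rewrite rules below
# are hypothetical and are not taken from the paper.

# Hypothetical Arabic prefixes (Buckwalter-style transliteration), each paired
# with a rewrite rule applied to the English side of a retrieved example.
PREFIX_RULES = {
    "Al": lambda en: "the " + en,   # definite article
    "w": lambda en: "and " + en,    # conjunction
}

# Hypothetical training examples: source phrase -> English translation.
EXAMPLES = {
    "ktAb": "book",
    "Alwld": "the boy",
}


def translate(word):
    """Return an English translation, generalizing over prefixes if needed."""
    if word in EXAMPLES:                      # exact match in the examples
        return EXAMPLES[word]
    for prefix, rewrite in PREFIX_RULES.items():
        stem = word[len(prefix):]
        if word.startswith(prefix) and stem:
            base = translate(stem)            # generalize: look up the stem
            if base is not None:
                return rewrite(base)          # adapt: rewrite the English side
    return None                               # no example covers this word


print(translate("AlktAb"))    # unseen form built from a seen stem
print(translate("wAlwld"))    # unseen form built from a seen phrase
```

Neither "AlktAb" nor "wAlwld" appears in the example table, yet both receive a translation because the stem or sub-phrase is covered and the prefix maps to a rewrite of the English side. A real system would need a morphological analyzer and some filtering of over-generated candidates rather than plain string prefixes.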

Related resources

Arabic-to-English Example Based Machine Translation Using Context-Insensitive Morphological Analysis

We describe and discuss the results of ongoing experiments that use morphological analysis in the context of Example-Based Machine Translation. The goal is to increase the coverage of our training examples so as to capture things that are not directly seen in the training text. This is done through a two-stage process of generalization and filtering.

A Systematic Adaptation Scheme for English-Hindi Example-Based Machine Translation

The success of Example-Based Machine Translation (EBMT) often depends upon how efficient the adaptation scheme is. Adaptation primarily aims at modifying retrieved examples to meet the required demands of a given translation task. The present work looks at adaptation for EBMT from English to Hindi. This paper describes a rule-driven adaptation scheme for modifying a retrieved translation exampl...

Domain Adaptation Through Phrase Generalization for Improved Statistical Machine Translation Quality

This paper presents a method for domain adaptation (incorporating out-of-domain data) through phrase generalization (learning/using phrase templates) in order to improve the Italian-English translation quality on the BTEC travel task. The process of phrase generalization is described, and its inclusion in the system resulted in noticeable, but only minor improvements because of alignment proble...

Monolingual Machine Translation The Tenth Biennial Conference of the Association for Machine Translation in the Americas AMT 2012 20 A Years Tsuyoshi Okita

This paper presents a detailed study of a method for morphology generalization and generation to address out-of-domain translations in English-to-Spanish phrase-based MT. The paper studies whether the morphological richness of the target language causes poor-quality translation when translating out-of-domain. In detail, this approach first translates into Spanish simplified forms and then predic...

A Specific Least General Generalization of Strings and Its Application to Example-Based Machine Translation

Since the least general generalization (LGG) of strings may cause an over-generalization in the generalization process of clauses, we propose a specific least general generalization (SLGG) of strings to reduce over-generalization. To create a SLGG of two strings, first a minimal match sequence between these strings is found. A minimal match sequence of two strings consists of similarities and differences t...

Publication date: 2007